In this section, we load our HELOC (home equity line of credit) dataset.
HELOC DATABASE
| | |
|---|---|
| Name | Datos1 |
| Number of rows | 10459 |
| Number of columns | 24 |
| _______________________ | |
| Column type frequency: | |
| character | 1 |
| numeric | 23 |
| ________________________ | |
| Group variables | None |
Variable type: character
| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|---|---|---|---|---|---|---|---|
| RiskPerformance | 0 | 1 | 3 | 4 | 0 | 2 | 0 |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| ExternalRiskEstimate | 0 | 1 | 67.43 | 21.12 | -9 | 63 | 71 | 79.0 | 94 | ▁▁▁▇▆ |
| MSinceOldestTradeOpen | 0 | 1 | 184.21 | 109.68 | -9 | 118 | 178 | 249.5 | 803 | ▆▇▂▁▁ |
| MSinceMostRecentTradeOpen | 0 | 1 | 8.54 | 13.30 | -9 | 3 | 5 | 11.0 | 383 | ▇▁▁▁▁ |
| AverageMInFile | 0 | 1 | 73.84 | 38.78 | -9 | 52 | 74 | 95.0 | 383 | ▇▇▁▁▁ |
| NumSatisfactoryTrades | 0 | 1 | 19.43 | 13.00 | -9 | 12 | 19 | 27.0 | 79 | ▂▇▃▁▁ |
| NumTrades60Ever2DerogPubRec | 0 | 1 | 0.04 | 2.51 | -9 | 0 | 0 | 1.0 | 19 | ▁▇▁▁▁ |
| NumTrades90Ever2DerogPubRec | 0 | 1 | -0.14 | 2.37 | -9 | 0 | 0 | 0.0 | 19 | ▁▇▁▁▁ |
| PercentTradesNeverDelq | 0 | 1 | 86.66 | 26.00 | -9 | 87 | 96 | 100.0 | 100 | ▁▁▁▁▇ |
| MSinceMostRecentDelq | 0 | 1 | 6.76 | 20.50 | -9 | -7 | -7 | 14.0 | 83 | ▇▂▁▁▁ |
| MaxDelq2PublicRecLast12M | 0 | 1 | 4.93 | 3.76 | -9 | 4 | 6 | 7.0 | 9 | ▁▁▁▂▇ |
| MaxDelqEver | 0 | 1 | 5.51 | 3.97 | -9 | 5 | 6 | 8.0 | 8 | ▁▁▁▁▇ |
| NumTotalTrades | 0 | 1 | 20.86 | 14.58 | -9 | 12 | 20 | 29.0 | 104 | ▅▇▂▁▁ |
| NumTradesOpeninLast12M | 0 | 1 | 1.25 | 3.07 | -9 | 0 | 1 | 3.0 | 19 | ▁▇▃▁▁ |
| PercentInstallTrades | 0 | 1 | 32.17 | 20.13 | -9 | 20 | 31 | 44.0 | 100 | ▂▇▆▂▁ |
| MSinceMostRecentInqexcl7days | 0 | 1 | -0.33 | 6.07 | -9 | -7 | 0 | 1.0 | 24 | ▃▇▁▁▁ |
| NumInqLast6M | 0 | 1 | 0.87 | 3.18 | -9 | 0 | 1 | 2.0 | 66 | ▇▁▁▁▁ |
| NumInqLast6Mexcl7days | 0 | 1 | 0.81 | 3.14 | -9 | 0 | 1 | 2.0 | 66 | ▇▁▁▁▁ |
| NetFractionRevolvingBurden | 0 | 1 | 31.63 | 30.06 | -9 | 5 | 25 | 54.0 | 232 | ▇▅▁▁▁ |
| NetFractionInstallBurden | 0 | 1 | 39.16 | 42.10 | -9 | -8 | 47 | 79.0 | 471 | ▇▂▁▁▁ |
| NumRevolvingTradesWBalance | 0 | 1 | 3.19 | 4.41 | -9 | 2 | 3 | 5.0 | 32 | ▁▇▁▁▁ |
| NumInstallTradesWBalance | 0 | 1 | 0.98 | 4.06 | -9 | 1 | 2 | 3.0 | 23 | ▂▇▂▁▁ |
| NumBank2NatlTradesWHighUtilization | 0 | 1 | 0.02 | 3.36 | -9 | 0 | 0 | 1.0 | 18 | ▂▇▃▁▁ |
| PercentTradesWBalance | 0 | 1 | 62.08 | 27.71 | -9 | 47 | 67 | 82.0 | 100 | ▂▂▆▇▇ |
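The completeness and moment columns of the table above (`n_missing`, `complete_rate`, `mean`, `sd`) can be reproduced by hand. The following is a minimal sketch in standard-library Python, not the code that generated the report; the helper name `skim_numeric` and the toy column are hypothetical, and missing entries are assumed to be encoded as `None` or NaN.

```python
import math
import statistics


def skim_numeric(values):
    """Return skim-style completeness and moment statistics for one column.

    Assumption: missing entries are encoded as None or NaN.
    Needs at least two non-missing values (required by the sample sd).
    """
    present = [
        v for v in values
        if v is not None and not (isinstance(v, float) and math.isnan(v))
    ]
    return {
        "n_missing": len(values) - len(present),
        "complete_rate": len(present) / len(values),
        "mean": statistics.fmean(present),
        "sd": statistics.stdev(present),  # sample standard deviation (n - 1)
    }


# Hypothetical toy column, not taken from the HELOC data.
toy_column = [63, 71, None, 79, 94, -9]
toy_stats = skim_numeric(toy_column)
```

In the report itself every variable has `n_missing = 0` and `complete_rate = 1`, so the toy column includes one `None` only to exercise the missing-value path.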
Next, we obtain a statistical summary of all the variables.
| RiskPerformance | ExternalRiskEstimate | MSinceOldestTradeOpen | MSinceMostRecentTradeOpen | AverageMInFile | NumSatisfactoryTrades | NumTrades60Ever2DerogPubRec | NumTrades90Ever2DerogPubRec | PercentTradesNeverDelq | MSinceMostRecentDelq | MaxDelq2PublicRecLast12M | MaxDelqEver | NumTotalTrades | NumTradesOpeninLast12M | PercentInstallTrades | MSinceMostRecentInqexcl7days | NumInqLast6M | NumInqLast6Mexcl7days | NetFractionRevolvingBurden | NetFractionInstallBurden | NumRevolvingTradesWBalance | NumInstallTradesWBalance | NumBank2NatlTradesWHighUtilization | PercentTradesWBalance | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Length:10459 | Min. :-9.00 | Min. : -9.0 | Min. : -9.000 | Min. : -9.00 | Min. :-9.00 | Min. :-9.00000 | Min. :-9.0000 | Min. : -9.00 | Min. :-9.000 | Length:10459 | Length:10459 | Min. : -9.00 | Min. :-9.000 | Min. : -9.00 | Min. :-9.0000 | Min. :-9.0000 | Min. :-9.0000 | Min. : -9.00 | Min. : -9.00 | Min. :-9.00 | Min. :-9.0000 | Min. :-9.00000 | Min. : -9.00 | |
| Class :character | 1st Qu.:63.00 | 1st Qu.:118.0 | 1st Qu.: 3.000 | 1st Qu.: 52.00 | 1st Qu.:12.00 | 1st Qu.: 0.00000 | 1st Qu.: 0.0000 | 1st Qu.: 87.00 | 1st Qu.:-7.000 | Class :character | Class :character | 1st Qu.: 12.00 | 1st Qu.: 0.000 | 1st Qu.: 20.00 | 1st Qu.:-7.0000 | 1st Qu.: 0.0000 | 1st Qu.: 0.0000 | 1st Qu.: 5.00 | 1st Qu.: -8.00 | 1st Qu.: 2.00 | 1st Qu.: 1.0000 | 1st Qu.: 0.00000 | 1st Qu.: 47.00 | |
| Mode :character | Median :71.00 | Median :178.0 | Median : 5.000 | Median : 74.00 | Median :19.00 | Median : 0.00000 | Median : 0.0000 | Median : 96.00 | Median :-7.000 | Mode :character | Mode :character | Median : 20.00 | Median : 1.000 | Median : 31.00 | Median : 0.0000 | Median : 1.0000 | Median : 1.0000 | Median : 25.00 | Median : 47.00 | Median : 3.00 | Median : 2.0000 | Median : 0.00000 | Median : 67.00 | |
| NA | Mean :67.43 | Mean :184.2 | Mean : 8.543 | Mean : 73.84 | Mean :19.43 | Mean : 0.04274 | Mean :-0.1428 | Mean : 86.66 | Mean : 6.762 | NA | NA | Mean : 20.86 | Mean : 1.253 | Mean : 32.17 | Mean :-0.3254 | Mean : 0.8681 | Mean : 0.8126 | Mean : 31.63 | Mean : 39.16 | Mean : 3.19 | Mean : 0.9761 | Mean : 0.01807 | Mean : 62.08 | |
| NA | 3rd Qu.:79.00 | 3rd Qu.:249.5 | 3rd Qu.: 11.000 | 3rd Qu.: 95.00 | 3rd Qu.:27.00 | 3rd Qu.: 1.00000 | 3rd Qu.: 0.0000 | 3rd Qu.:100.00 | 3rd Qu.:14.000 | NA | NA | 3rd Qu.: 29.00 | 3rd Qu.: 3.000 | 3rd Qu.: 44.00 | 3rd Qu.: 1.0000 | 3rd Qu.: 2.0000 | 3rd Qu.: 2.0000 | 3rd Qu.: 54.00 | 3rd Qu.: 79.00 | 3rd Qu.: 5.00 | 3rd Qu.: 3.0000 | 3rd Qu.: 1.00000 | 3rd Qu.: 82.00 | |
| NA | Max. :94.00 | Max. :803.0 | Max. :383.000 | Max. :383.00 | Max. :79.00 | Max. :19.00000 | Max. :19.0000 | Max. :100.00 | Max. :83.000 | NA | NA | Max. :104.00 | Max. :19.000 | Max. :100.00 | Max. :24.0000 | Max. :66.0000 | Max. :66.0000 | Max. :232.00 | Max. :471.00 | Max. :32.00 | Max. :23.0000 | Max. :18.00000 | Max. :100.00 | |
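Each numeric column of the summary above reports six values: minimum, first quartile, median, mean, third quartile, and maximum. As a rough sketch of how these can be computed, the standard-library Python below uses `statistics.quantiles` with `method="inclusive"`, which interpolates linearly between order statistics and should agree with the quartiles shown here. The function name `six_number_summary` and the toy data are hypothetical.

```python
import statistics


def six_number_summary(values):
    """Min, quartiles, mean and max for one numeric column."""
    data = sorted(values)
    # "inclusive" treats the min and max as the 0th and 100th percentiles
    # and interpolates linearly in between.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "Min.": data[0],
        "1st Qu.": q1,
        "Median": median,
        "Mean": statistics.fmean(data),
        "3rd Qu.": q3,
        "Max.": data[-1],
    }


# Hypothetical toy data: the integers 1..10.
toy_summary = six_number_summary(range(1, 11))
```

For the integers 1 to 10 this gives quartiles of 3.25, 5.5 and 7.75, with a mean of 5.5.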
We plot the distribution of the categorical variable (as a bar chart, since the variable is not numeric):
We plot histograms of all the numeric variables:
Here we plot histograms of all the numeric variables again, this time split by the levels of the categorical variable:
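A grouped histogram boils down to counting values per bin separately for each class. The sketch below shows that counting step in standard-library Python. It is an illustration only: the function name, the bin width, the toy scores, and the class labels "Good" and "Bad" are all assumptions (the report only shows that `RiskPerformance` has two distinct values).

```python
from collections import Counter, defaultdict


def grouped_histogram(values, labels, bin_width):
    """Count values per fixed-width bin, separately for each class label."""
    counts = defaultdict(Counter)
    for value, label in zip(values, labels):
        # Left edge of the bin that contains this value.
        bin_start = (value // bin_width) * bin_width
        counts[label][bin_start] += 1
    return {label: dict(sorted(c.items())) for label, c in counts.items()}


# Hypothetical toy data; labels "Good"/"Bad" are assumed, not from the report.
scores = [55, 62, 68, 71, 74, 79, 83, 90]
labels = ["Bad", "Bad", "Bad", "Good", "Bad", "Good", "Good", "Good"]
hist = grouped_histogram(scores, labels, bin_width=10)
```

Each inner dictionary maps a bin's left edge to its count, which is the data a plotting library would draw as one set of bars per class.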
Common statistical tests for checking normality include:
Shapiro-Wilk test
| Variable | P_Value |
|---|---|
| PercentInstallTrades | 2.028e-24 |
| NumSatisfactoryTrades | 1.632e-26 |
| MSinceOldestTradeOpen | 1.193e-26 |
| NumTotalTrades | 2.315e-27 |
| AverageMInFile | 2.113e-29 |
| NetFractionRevolvingBurden | 1.985e-41 |
| PercentTradesWBalance | 1.532e-43 |
| MSinceMostRecentInqexcl7days | 7.017e-56 |
| NetFractionInstallBurden | 1.774e-58 |
| NumRevolvingTradesWBalance | 3.578e-59 |
| NumTradesOpeninLast12M | 4.082e-64 |
| NumInstallTradesWBalance | 2.141e-66 |
| MSinceMostRecentDelq | 1.119e-66 |
| NumBank2NatlTradesWHighUtilization | 4.080e-68 |
| NumInqLast6M | 2.471e-68 |
| NumInqLast6Mexcl7days | 5.447e-69 |
| ExternalRiskEstimate | 1.891e-70 |
| MSinceMostRecentTradeOpen | 1.277e-75 |
| NumTrades60Ever2DerogPubRec | 2.279e-77 |
| PercentTradesNeverDelq | 3.385e-78 |
| NumTrades90Ever2DerogPubRec | 1.179e-80 |
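Taking the conventional 0.05 significance level, the p-values in this table can be screened mechanically. The short sketch below copies three rows from the table (the largest, second largest and smallest p-values) and flags those below the threshold. The names `ALPHA` and `rejects_normality` are illustrative, not part of the original analysis.

```python
ALPHA = 0.05  # conventional significance level

# Three rows copied from the Shapiro-Wilk table above.
p_values = {
    "PercentInstallTrades": 2.028e-24,
    "NumSatisfactoryTrades": 1.632e-26,
    "NumTrades90Ever2DerogPubRec": 1.179e-80,
}


def rejects_normality(p_value, alpha=ALPHA):
    """True when the test rejects the null hypothesis of normality."""
    return p_value < alpha


non_normal = [name for name, p in p_values.items() if rejects_normality(p)]
```

Since even the largest p-value in the table (2.028e-24) is far below 0.05, every listed variable is flagged as departing from normality.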
Visually, you can use histograms, Q-Q plots, and density plots to assess normality.
Shapiro-Wilk test: Checks whether each numeric variable follows a normal distribution. P-values below 0.05 indicate a significant departure from normality at the 5% level.
Histogram with density: Shows the distribution of the variable with an overlaid density curve, to help judge normality visually.
Q-Q plot: Compares the quantiles of the variable with the quantiles of a theoretical normal distribution. If the data are normal, the points should fall along the reference line.
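To make the Q-Q comparison concrete, the sketch below computes the points of a normal Q-Q plot with the Python standard library. It is a simplified illustration under stated assumptions: plotting positions `(i - 0.5) / n` are one common convention, the reference normal uses the sample mean and standard deviation, and the helper names and toy samples are hypothetical. The correlation between theoretical and observed quantiles is added as a rough numeric stand-in for "how close to the line" the points fall.

```python
from statistics import NormalDist, fmean, stdev


def qq_points(sample):
    """Return (theoretical, observed) quantile pairs for a normal Q-Q plot.

    Needs at least two values that are not all identical.
    """
    data = sorted(sample)
    n = len(data)
    reference = NormalDist(mu=fmean(data), sigma=stdev(data))
    # Assumed plotting positions: (i - 0.5) / n for i = 1..n.
    return [
        (reference.inv_cdf((i - 0.5) / n), x)
        for i, x in enumerate(data, start=1)
    ]


def qq_correlation(points):
    """Pearson correlation between theoretical and observed quantiles."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = fmean(xs), fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy samples: one symmetric, one strongly right-skewed.
r_symmetric = qq_correlation(qq_points([-2, -1, 0, 1, 2]))
r_skewed = qq_correlation(qq_points([1, 1, 1, 2, 50]))
```

The symmetric sample yields a correlation close to 1, while the skewed sample yields a clearly lower value, mirroring what the eye sees when points bend away from the reference line.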